Algorithmic fairness datasets: the story so far

Abstract

Data-driven algorithms are studied in diverse domains to support critical decisions, directly impacting people's well-being. As a result, a growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of the risks and opportunities of automated decision-making for historically disadvantaged populations. Progress in fair machine learning hinges on data, which can be appropriately used only if adequately documented. Unfortunately, the algorithmic fairness community suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and the scatteredness of available information (sparsity). In this work, we target this data documentation debt by surveying over two hundred datasets employed in algorithmic fairness research, producing standardized and searchable documentation for each of them. Moreover, we rigorously identify the three most popular fairness datasets, namely Adult, COMPAS, and German Credit, for which we compile in-depth documentation. This unifying documentation effort supports multiple contributions. Firstly, we summarize the merits and limitations of Adult, COMPAS, and German Credit, adding to and unifying recent scholarship, and calling into question their suitability as general-purpose fairness benchmarks. Secondly, we document hundreds of available alternatives, annotating their domain and supported fairness tasks, along with additional properties of interest for fairness researchers. Finally, we analyze these datasets from the perspective of five important data curation topics: anonymization, consent, inclusivity, sensitive attributes, and transparency. We discuss different approaches and levels of attention to these topics, making them tangible, and distill them into a set of best practices for the curation of novel resources.

Similar resources

Curcumin: the story so far.

Curcumin is a polyphenol derived from the herbal remedy and dietary spice turmeric. It possesses diverse anti-inflammatory and anti-cancer properties following oral or topical administration. Apart from curcumin's potent antioxidant capacity at neutral and acidic pH, its mechanisms of action include inhibition of several cell signalling pathways at multiple levels, effects on cellular enzymes s...

MetateM: The Story so Far

METATEM is a simple programming language based on the direct execution of temporal logic statements. It was introduced through a number of papers [35,2,3] culminating in a book collecting together work on the basic temporal language [5]. However, since that time, there has been a programme of research, carried out over a number of years, extending, adapting and applying the basic approach. In p...

Estrogen and cognition: the story so far.

Estrogen is a complex gonadal hormone that reportedly exhibits numerous neurobehavioral effects in both human volunteers and animals. Evidence from basic science and clinical research demonstrates that estrogen can enhance cognitive function of healthy older women as well as those with Alzheimer’s disease (AD) (1–3). Although the biology of estrogen strongly supports its neuromodulatory and neur...

FCA and IR: The Story So Far

The application of Formal Concept Analysis (FCA) to Information Retrieval (IR) is twenty-five years old. Over this period, a number of papers have explored the potentials of FCA for various information finding tasks while several system prototypes have been made available for experimentation and testing. In this talk we survey what has been achieved so far, discussing lessons and implications f...

Journal

Journal title: Data Mining and Knowledge Discovery

Year: 2022

ISSN: 1573-756X, 1384-5810

DOI: https://doi.org/10.1007/s10618-022-00854-z